Approximate Dimension Reduction at NTCIR

Authors

  • Fan Jiang
  • Michael L. Littman
Abstract

We compared cross-language retrieval methods based on dimension reduction (latent semantic indexing) on the NTCIR-1 data. These methods all use a collection of parallel documents (translations or approximate translations) and very little, if any, linguistic knowledge. On NTCIR-1, we compared latent semantic indexing (LSI), local LSI, and approximate dimensional equalization (ADE). We found that local LSI and ADE performed best on this collection and were comparable to the best-performing systems reported elsewhere. We also ran ADE on the NTCIR-2 data and found that it fared considerably less well.
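As a rough illustration of the general idea behind cross-language LSI with parallel documents, the sketch below merges each translation pair into one dual-language training document, takes a truncated SVD of the resulting term-by-document matrix, and folds monolingual queries and documents into the reduced space. It is not the authors' implementation and does not cover local LSI or ADE. The toy English/French corpus, the function names, and the choice of k = 2 are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy parallel corpus (English, French); not NTCIR data.
PARALLEL = [
    ("dog cat", "chien chat"),
    ("dog pet", "chien animal"),
    ("bank money", "banque argent"),
    ("money loan", "argent pret"),
]

def build_vocab(pairs):
    """Map every term from either language to a row index."""
    terms = sorted({t for en, fr in pairs for t in (en + " " + fr).split()})
    return {t: i for i, t in enumerate(terms)}

def count_vector(text, vocab):
    """Bag-of-words count vector; out-of-vocabulary terms are ignored."""
    v = np.zeros(len(vocab))
    for t in text.split():
        if t in vocab:
            v[vocab[t]] += 1.0
    return v

def train_lsi(pairs, k):
    """Truncated SVD of the matrix of merged dual-language documents."""
    vocab = build_vocab(pairs)
    # Each column is one training document: English text plus its translation.
    A = np.column_stack([count_vector(en + " " + fr, vocab) for en, fr in pairs])
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return vocab, U[:, :k]  # keep the k leading left singular vectors

def fold_in(text, vocab, U_k):
    """Project a monolingual text into the shared k-dimensional space."""
    return count_vector(text, vocab) @ U_k

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

vocab, U_k = train_lsi(PARALLEL, k=2)
query = fold_in("dog", vocab, U_k)                  # English query
sim_animal = cosine(query, fold_in("chat animal", vocab, U_k))   # French doc
sim_finance = cosine(query, fold_in("banque pret", vocab, U_k))  # French doc
print(sim_animal, sim_finance)
```

In this toy setting the English query "dog" shares no surface terms with either French document, yet it scores much higher against the animal-related one, because terms that co-occur in the merged training documents end up close together in the reduced space.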


Similar resources

University of Alicante at NTCIR-9 GeoTime

In this paper we present a complete system for the treatment of the geographical dimension in the text and its application to information retrieval. This system has been evaluated in the GeoTime task of the 9th NTCIR workshop, making it possible to compare the system with other current approaches to the topic. In order to participate in this task we have added to our GIR system the temporal dim...


Computational Methods For Functional Motif Identification and Approximate Dimension Reduction in Genomic Data

By Stoyan Georgiev, Department of Computational Biology and Bioinformatics, Duke University.


IBM Team at NTCIR-10 RITE2: Textual Entailment Using Temporal Dimension Reduction

Our system for the Japanese BC/EXAM subtasks in NTCIR10 RITE2 is an extension of our previous system for NTCIR9 RITE. The new techniques are (1) Case-aware noun phrase matching using ontologies: The motivation for this feature is to capture finer syntactic structures than simple word matching. We use ontologies to allow flexible matching of noun phrases. (2) Temporal expression matching after ma...


User Satisfaction Task: A Proposal for NTCIR-7

Good test collections, coupled with good evaluation metrics, are very useful for evaluating Information Access systems efficiently. But useful to whom? The in vitro (or Cranfield) evaluation paradigm has been criticised, mainly because of the absence of the user. On the other hand, user-in-the-loop evaluations are expensive, unrepeatable and often inconclusive. In light of this, we propose a ne...


An Improved Patent Machine Translation System Using Adaptive Enhancement for NTCIR-10 PatentMT Task

This paper describes the work that we conducted for the Chinese-English (CE) task of the NTCIR-10 patent machine translation evaluation. We built standard phrase-based and hierarchical phrase-based statistical machine translation (SMT) systems with optimized word segmentation, adaptive language model and improved parameter tuning strategy. Our systems outperform official baselines by approximat...




Publication date: 2001